Minimum Chi-square Estimation

In statistics, minimum chi-square estimation is a method of estimating unobserved quantities from observed data. In certain chi-square tests, one rejects a null hypothesis about a population distribution if a specified test statistic is too large, where that statistic would have approximately a chi-square distribution if the null hypothesis were true. In minimum chi-square estimation, one finds the parameter values that make that test statistic as small as possible. One consequence is that the test statistic then actually does have approximately a chi-square distribution when the sample size is large. Generally, one reduces the number of degrees of freedom by 1 for each parameter estimated by this method.
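
The idea can be written compactly. The notation here is an assumption introduced for illustration and does not come from the article: ''k'' cells with observed counts ''O''<sub>''i''</sub>, sample size ''n'', and cell probabilities ''p''<sub>''i''</sub>(''θ'') under the null model with parameter ''θ''.

```latex
\hat{\theta} = \arg\min_{\theta} \chi^2(\theta),
\qquad
\chi^2(\theta) = \sum_{i=1}^{k} \frac{\bigl(O_i - n\,p_i(\theta)\bigr)^2}{n\,p_i(\theta)}
```

Under this sketch, the rule stated above gives ''k'' − 1 degrees of freedom minus one for each estimated component of ''θ''.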


Illustration via an example

Suppose a certain random variable takes values in the set of non-negative integers 0, 1, 2, 3, .... A simple random sample of size 20 is taken, yielding the following data set. It is desired to test the null hypothesis that the population from which this sample was taken follows a Poisson distribution.

: \begin{array}{cc}
\text{value} & \text{frequency} \\ \hline
0 & 1 \\
1 & 2 \\
2 & 4 \\
3 & 5 \\
4 & 3 \\
5 & 3 \\
6 & 1 \\
7 & 0 \\
8 & 1 \\
>8 & 0
\end{array}

The maximum likelihood estimate of the population average is 3.3 (the sample mean, 66/20). One could apply Pearson's chi-square test of whether the population distribution is a Poisson distribution with expected value 3.3. However, the null hypothesis did not specify that particular Poisson distribution, only that the distribution is some Poisson distribution, and the number 3.3 came from the data, not from the null hypothesis. A rule of thumb says that when a parameter is estimated, one reduces the number of degrees of freedom by 1, in this case from 9 (since there are 10 cells) to 8. One might hope that the resulting test statistic would have approximately a chi-square distribution when the null hypothesis is true. That is not in general the case when maximum-likelihood estimation is used. It is, however, true asymptotically when minimum chi-square estimation is used.
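
As a rough check of the numbers in this example, the following Python sketch computes the sample mean from the frequency table and evaluates Pearson's statistic over the 10 cells at that value. The function and variable names (`poisson_pmf`, `chi_square`, `lam_mle`) are illustrative choices, not part of the article.

```python
import math

# Observed frequencies for the values 0..8, copied from the table;
# the ">8" cell has frequency 0.
observed = [1, 2, 4, 5, 3, 3, 1, 0, 1]
tail_observed = 0
n = sum(observed) + tail_observed  # sample size (20)


def poisson_pmf(x, lam):
    """Probability that a Poisson(lam) variable equals x."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


def chi_square(lam):
    """Pearson statistic over the cells 0, 1, ..., 8 and '>8'."""
    expected = [n * poisson_pmf(x, lam) for x in range(len(observed))]
    tail_expected = n - sum(expected)  # expected count in the ">8" cell
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat + (tail_observed - tail_expected) ** 2 / tail_expected


# For a Poisson model the maximum likelihood estimate is the sample mean.
lam_mle = sum(x * o for x, o in enumerate(observed)) / n
print("MLE of the mean:", lam_mle)
print("Pearson statistic at the MLE:", chi_square(lam_mle))
```

The statistic printed here is evaluated at the maximum likelihood estimate, so it is not the minimized value discussed in the next section.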


Finding the minimum chi-square estimate

The minimum chi-square estimate of the population mean ''λ'' is the number that minimizes the chi-square statistic

: \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \sum_{x=0}^8 \frac{(\text{observed}_x - 20\lambda^x e^{-\lambda}/x!)^2}{20\lambda^x e^{-\lambda}/x!} + \frac{(0 - a)^2}{a}

where ''a'' is the estimated expected number in the "> 8" cell, and "20" appears because it is the sample size. The value of ''a'' is 20 times the probability that a Poisson-distributed random variable exceeds 8; that probability is easily calculated as 1 minus the sum of the probabilities corresponding to 0 through 8. The last term reduces simply to ''a'', since (0 − ''a'')<sup>2</sup>/''a'' = ''a''. Numerical computation shows that the value of ''λ'' that minimizes the chi-square statistic is about 3.5242. That is the minimum chi-square estimate of ''λ''. For that value of ''λ'', the chi-square statistic is about 3.062764.

There are 10 cells. If the null hypothesis had specified a single distribution, rather than requiring ''λ'' to be estimated, then the null distribution of the test statistic would be a chi-square distribution with 10 − 1 = 9 degrees of freedom. Since ''λ'' had to be estimated, one additional degree of freedom is lost, leaving 8. The expected value of a chi-square random variable with 8 degrees of freedom is 8. The observed value, 3.062764, is well below that, so the null hypothesis is not rejected.
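
The numerical minimization can be sketched in plain Python as follows. The article does not say which numerical method was used; the ternary search, the search interval [1, 8], and the helper names (`ternary_search_min`, `chi2_sf_even_df`) are assumptions made here for illustration. The tail-probability helper uses the closed form that holds for a chi-square distribution with an even number of degrees of freedom, and the p-value it produces is an addition, not a figure from the article.

```python
import math

observed = [1, 2, 4, 5, 3, 3, 1, 0, 1]  # frequencies of the values 0..8; ">8" is empty
n = 20                                   # sample size


def chi_square(lam):
    """Chi-square statistic for a Poisson(lam) fit over the 10 cells."""
    expected = [n * lam ** x * math.exp(-lam) / math.factorial(x)
                for x in range(len(observed))]
    a = n - sum(expected)  # expected count in the ">8" cell
    # The ">8" cell has observed count 0, so its term (0 - a)^2 / a equals a.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected)) + a


def ternary_search_min(f, lo, hi, tol=1e-8):
    """Locate the minimizer of f on [lo, hi], assuming f is unimodal there."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


def chi2_sf_even_df(x, df):
    """Upper tail probability of a chi-square variable with even df."""
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


lam_hat = ternary_search_min(chi_square, 1.0, 8.0)  # assumed search interval
stat = chi_square(lam_hat)
p_value = chi2_sf_even_df(stat, 8)  # 10 cells - 1 - 1 estimated parameter
print("minimum chi-square estimate:", lam_hat)
print("chi-square statistic:", stat)
print("upper tail probability with 8 df:", p_value)
```

Ternary search is used only because it needs no external libraries; any one-dimensional minimizer should give the same estimate to within its tolerance.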


External links


"Minimum Chi-Square, Not Maximum Likelihood!", by Joseph Berkson